Abstract: Total RNA was isolated from mycelium of Trichoderma harzianum with a total RNA extraction kit, and two clear rRNA bands (28S and 18S)
were observed by agarose gel electrophoresis. By joining the 3' end sequence obtained by RACE with the known SA76 EST from a cDNA library of
T. harzianum, a full-length cDNA sequence of 2019 bp was obtained. It contained an open reading frame of 1593 bp ending in a TAA stop codon,
a 5' untranslated region (5'UTR) of 226 bp, a 3' untranslated region (3'UTR) of 201 bp and a poly(A) tail of 29 nt, and encoded a protein of
530 amino acids with a signal peptide. The deduced protein shared 53% identity with the secreted aspartic proteinase of G. zeae, 37% with
that of N. crassa and 36% with that of C. globosum. The full-length cDNA sequence of the secreted aspartic proteinase gene from T. harzianum
was cloned for the first time using the BD SMART RACE technique, which provides a foundation for obtaining and validating functional genes
of T. harzianum.
Introduction

Trichoderma harzianum strains are applied commercially as
biological control agents against a large number of plant fungal
pathogens. They act through different mechanisms, such as the
production of antifungal metabolites, competition for space and
nutrients, and mycoparasitism (Harman and Bjorkman 1998;
Manczinger and Polner 1985). Another activity associated with
biocontrol is the secretion of proteases that inactivate enzymes
secreted by pathogenic fungi, which effectively prevents the
pathogens from penetrating plant tissue. For instance, a
hydrolytic enzyme that plays an important role in mycoparasitism
was purified and biochemically characterized as a serine protease
(Geremia et al. 1993). However, very few studies have addressed
protease expression by antipathogenic fungi (Nathalie et al.
2001).

To explore the mechanism of biocontrol agents, a cDNA library
was constructed from T. harzianum mycelium and 3298 ESTs were
acquired after sequencing (Liu and Yang 2005). A partial cDNA
sequence of a secreted aspartic proteinase gene (SA76) was
obtained by step-walking sequencing. Rapid amplification of cDNA
ends (RACE) is one of the main methods for extending a partially
known exon sequence and cloning the full-length cDNA (Schaefer
1995), and it has been widely used to extend functional fragments
of genes of interest. Acquiring full-length cDNA from ESTs is
important for genomics and functional genomics (Louie and
Patricia 1999; Zheng and James 2000). Here, BD SMART RACE was
used to amplify the 3' end of SA76.

In this study, the full-length SA76 gene was cloned for the first
time from mycelium of T. harzianum by the BD SMART RACE
technique, which provides a foundation for further study of the
structure and function of SA76. It should also help to clarify
the hydrolytic pathway of the secreted aspartic proteinase and
its biocontrol mechanism.

Materials and methods 

Materials 

All reagents were of the highest purity commercially available.
The Advantage® 2 PCR Enzyme System, SMART™ PCR cDNA Synthesis
Kit, BD SMART™ RACE cDNA Amplification Kit and NucleoTrap®
Nucleic Acid Purification Kits were from Clontech (Clontech
Laboratories, Palo Alto, CA). The TA cloning kit was from TaKaRa.
The total RNA extraction kit was from Watson (Watson
Biotechnologies, Inc.).

Methods 

Total RNA extraction 

Total RNA was extracted from mycelium of T. harzianum using the
total RNA extraction kit (Watson).

Primers for 3' RACE

3'-RACE CDS Primer A:

5'-AAGCAGTGGTATCAACGCAGAGTAC(T)30VN-3'

(N = A, C, G, or T; V = A, G, or C)

10X Universal Primer A Mix (UPM):

Long: 5'-CTAATACGACTCACTATAGGGCAAGCAGTGGTATCAACGCAGAGT-3'

Short: 5'-CTAATACGACTCACTATAGGGC-3'

Nested Universal Primer A (NUP):

5'-AAGCAGTGGTATCAACGCAGAGT-3'

GSP3: 5'-CCAACCGTACAGCTTGCTGCTCAACAC-3'

NGSP3: 5'-AGCCCTCATCTATGGCGGTCTTGATCGCAG-3'

Primers for verifying RACE results

F1: 5'-CCCAAGCTTTCTTCCGTCTCTTTGTTCCTCTTTC-3'

R1: 5'-CCGGAATTCATGGTCCTCTCGCTCGTCAATATAT-3'

RACE PCR and cloning the full-length cDNAs 
The 3' ends of the transcripts were amplified by rapid
amplification of cDNA ends (RACE) using the BD SMART™ RACE cDNA
Amplification Kit according to the manufacturer's instructions
(Clontech). Double-stranded cDNA was prepared from total RNA of
Trichoderma harzianum mycelium. Primers were designed according
to the sequences of the cDNA clones above. The first PCR was
carried out in a GeneAmp® PCR System 9700 under 'touchdown'
cycling conditions: 6 cycles of 94 °C for 30 s and 72 °C for
3 min; 6 cycles of 94 °C for 30 s, 70 °C for 30 s and 72 °C for
3 min; and 25 cycles of 94 °C for 30 s, 68 °C for 30 s and 72 °C
for 3 min. In the second PCR, an aliquot of the primary PCR
product was reamplified using the inner (nested) primers. The
conditions were 94 °C for 2 min; 30 cycles of 94 °C for 30 s,
68 °C for 30 s and 72 °C for 2 min; followed by a final extension
at 72 °C for 10 min.
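
The two cycling programs above can be written down as plain data, which makes it easy to check the cycle counts and the nominal hold time. The following Python sketch is only an illustration: the class and function names (Step, Stage, total_cycles, hold_seconds) are hypothetical, and the computed time ignores ramping between temperatures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int    # block temperature in degrees Celsius
    seconds: int   # hold time at that temperature


@dataclass(frozen=True)
class Stage:
    cycles: int    # how many times the steps are repeated
    steps: tuple   # ordered Step objects within one cycle


# First (touchdown) PCR, as described in the text.
FIRST_PCR = (
    Stage(6, (Step(94, 30), Step(72, 180))),
    Stage(6, (Step(94, 30), Step(70, 30), Step(72, 180))),
    Stage(25, (Step(94, 30), Step(68, 30), Step(72, 180))),
)

# Second (nested) PCR, as described in the text.
SECOND_PCR = (
    Stage(1, (Step(94, 120),)),                              # initial denaturation
    Stage(30, (Step(94, 30), Step(68, 30), Step(72, 120))),
    Stage(1, (Step(72, 600),)),                              # final extension
)


def total_cycles(program):
    """Count amplification cycles; single-step holds are not counted."""
    return sum(stage.cycles for stage in program if len(stage.steps) > 1)


def hold_seconds(program):
    """Sum of programmed hold times, excluding temperature ramping."""
    return sum(
        stage.cycles * sum(step.seconds for step in stage.steps)
        for stage in program
    )


if __name__ == "__main__":
    for name, program in (("first PCR", FIRST_PCR), ("second PCR", SECOND_PCR)):
        minutes = hold_seconds(program) / 60
        print(f"{name}: {total_cycles(program)} cycles, {minutes:.0f} min of holds")
```

Under these assumptions the touchdown program comprises 37 amplification cycles and the nested program 30.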

PCR products were analyzed by agarose gel electrophoresis and
cloned using the TA cloning kit, and the inserts were sequenced.

Results 

Total RNA isolation 

Total RNA was isolated from mycelium of T. harzianum with the
total RNA extraction kit, and two clear rRNA bands (28S and 18S)
were observed by agarose gel electrophoresis (Fig. 1). The 28S
rRNA band was nearly twice as bright as the 18S rRNA band, and
the OD260/OD280 ratio was 1.98. These results showed that the RNA
was largely undegraded and of high purity, and therefore suitable
for reverse transcription into T. harzianum cDNA.
Fig. 1: Total RNA from T. harzianum

Cloning of SA76 gene by BD SMART RACE Technique

The results of 3'RACE

Electrophoresis of the first PCR products on a 1% agarose gel
showed no specific band, only a smear (Fig. 2, Lane 1). Two bands
of about 1200 bp and 700 bp were seen after the second PCR with
the inner primer (Fig. 3, Lane 1). The 1200 bp band was the
target sequence.
Fig. 2: 3'RACE product of the first PCR

Lane M: DL2000 DNA marker (TaKaRa, Japan); Lane 1: 3'RACE product of
the first PCR amplification

Fig. 3: 3'RACE product of the second PCR

Lane M: DL2000 DNA marker; Lane 1: 3'RACE product of the second
PCR amplification

Cloning of full-length cDNA 

Sequencing showed that the 3'RACE product was 1233 bp long,
excluding vector sequence. A full-length cDNA of 2019 bp was
assembled by joining the 3' fragment with the known SA76 EST.
Specific primers located near the 5' and 3' ends of the gene were
designed to verify the joined product. Electrophoresis on a 1%
agarose gel showed that the resulting fragment was approximately
1745 bp long (Fig. 4), suggesting that the full-length cDNA had
been obtained successfully.
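
The text does not state how the 3' fragment and the EST were joined. One simple way to picture the step is a suffix-prefix merge: find the longest stretch where the end of the EST equals the start of the 3'RACE read, then append the non-overlapping remainder. The Python sketch below assumes an exact overlap, and the function name merge_by_overlap is hypothetical; real reads usually need an aligner that tolerates mismatches. The two short inputs are excerpts around the NGSP3 site taken from Fig. 5, not the full sequences.

```python
def merge_by_overlap(left: str, right: str, min_overlap: int = 20):
    """Join two sequences on their longest exact suffix/prefix overlap.

    Returns the merged sequence and the overlap length. Raises ValueError
    when no overlap of at least min_overlap nucleotides exists.
    """
    longest_possible = min(len(left), len(right))
    for k in range(longest_possible, min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:], k
    raise ValueError(f"no exact overlap of at least {min_overlap} nt")


# Illustrative excerpts only (positions 901-933 and 904-945 of Fig. 5).
est_tail = "GGGAGCCCTCATCTATGGCGGTCTTGATCGCAG"
race_head = "AGCCCTCATCTATGGCGGTCTTGATCGCAGCAAATTCATCGG"

if __name__ == "__main__":
    merged, overlap = merge_by_overlap(est_tail, race_head)
    print(overlap, len(merged))
    print(merged)
```

Here the overlap is the 30 nt NGSP3 sequence itself, since the nested 3'RACE product starts at that primer.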



Fig. 4: Verification of the full-length cDNA with specific primers

Lane M: DL2000 DNA marker (TaKaRa, Japan); Lane 1: PCR product. 

Sequence analysis of full-length cDNA 

By joining the 3' end sequence with the known SA76 EST from the
cDNA library of T. harzianum, a full-length cDNA sequence of
2019 bp was acquired (Fig. 5). It contained an open reading frame
of 1593 bp ending in a TAA stop codon, a 5' untranslated region
(5'UTR) of 226 bp, a 3' untranslated region (3'UTR) of 201 bp and
a poly(A) tail of 29 nt. The open reading frame encoded a protein
of 530 amino acids with a theoretical molecular weight of
55.4 kDa and a calculated pI of 4.35. The nucleotide sequence
data have been submitted to GenBank (accession number: EF063645).
1 TCGAATTCCGACGTTCCCCCGCTAGTGAGATCGCGCCGAGATCTGGGTTAATCACTCTGT

61 TGACTCGATTGCCTTCAGCTGGGACTCGATTTTCCTCTTTCTCTTTTCTCTAGCTCCCCC

121 CTTCTTCCGTCTCTTTGTTCCTCTTTCCACCCGCCGGAACCGATATCGTTGCGAACACAG

181 GGAGACAGAGAGACGAGAAGGAGAGTCGACAGAACCTTCGCACACCATGAGGCTGACGAC
MRLTT

241 AGCGCTCGGGCTGCTGATCGCGGCGCAACATGCGGAAGCCGTTGTCACTCCGATGTTCCC
ALGLLIAAQHAEAVVTPMFP

301 GCGGGCGGAATCTGGTGATGGATATTTGTCGATCCCCGTGGGAACCATCAAGAGGCCTCA
RAESGDGYLSIPVGTIKRPH

361 CAACAAGGTTGGAAAGAGAAGCGCCATTGACGCAGTATTGGAGAATATGGATTTCTTCTA
NKVGKRSAIDAVLENMDFFY

421 TGCCATCGAAATCGGCCTAGGAACTCCTCCCCAGAACGTAACTGTCCTCGTCGATACAGG
AIEIGLGTPPQNVTVLVDTG

481 ATCCAGCGAGCTATGGGTCAATCCGGACTGCTCGACCGCACCGTCCGAGTCGCAGGCCGA
SSELWVNPDCSTAPSESQAE

541 ACAGTGTCAGCAGCTCGGCCAATACAATCCCAGAAGATCGAGAACGCCGCCGGTTGGTCC
QCQQLGQYNPRRSRTPPVGP

601 GTTTGGACGCGAGGAAATCAACTATGGCGACCCAACAGACCAGTCCACGCAGACGTCAGT
FGREEINYGDPTDQSTQTSV

661 CGACATCACCTACTATGCCGACACGCTGAGCTTTGGTAGGAGTCAGGTCAAGAATCAGAC
DITYYADTLSFGRSQVKNQT

721 GTTTGGCGTTGTTACGTCCAGCGAGGGCCAGGCACAGGGCATCATGGGCCTCGCGCCTGA
FGVVTSSEGQAQGIMGLAPD

781 TGTTCGAGGAGGATTTCCAGGCGACCAACCGTACAGCTTGCTGCTCAACACAATGGCCGA
VRGGFPGDQPYSLLLNTMAD

841 CCAGGGAGTCATTGCCAGCCGGGTCTTTTCCCTTGACCTCCGGCATTCCGATTCAGAGAC
QGVIASRVFSLDLRHSDSET

901 GGGAGCCCTCATCTATGGCGGTCTTGATCGCAGCAAATTCATCGGTTCCCTCGAGACTCG
GALIYGGLDRSKFIGSLETR

961 ACCCATCGTACCCGGCATCCAAGGCGAAACACGTCTGGCCGTAAATCTGACTACACTGGG
PIVPGIQGETRLAVNLTTLG

1021 CCTCACGCAAAGCCGTTCGCAGAGCTTCAGGCTGAACAGCGCCGACACAAACGTGATGCT
LTQSRSQSFRLNSADTNVML

1081 CGACTCTGGCACGACGCTCAGCCGCATGCACTCCGCTGCCGCATCGCCTATCCTCGAGAC
DSGTTLSRMHSAAASPILET

1141 TCTGGGCGCCCAAAACGATGGCGAGGGCTACTTTTTTGTGCCGTGCTCGCTTCGTGACTC
LGAQNDGEGYFFVPCSLRDS

1201 CGCTGGCAGTGTCGATTTTGGCTTCGGCAACAAGGTCATCAGGGTTCCCTTCTCTGATTT
AGSVDFGFGNKVIRVPFSDF

1261 CATCCTATCAGCAGGCGATAGTGGCGGCCCCAGCGACTATTGTTATGTTGGCCTGGTCCT
ILSAGDSGGPSDYCYVGLVL

1321 GACGACGGACCAGCAGATTCTGGGAGACACGGTGTTGAGGGCTGGATACTTTGTATTCGA
TTDQQILGDTVLRAGYFVFD

1381 TTGGGACAACCAGGAGGTTCACATCGCCCAGGCCGCTGACTGTGGCAGCAGTGACATTGT
WDNQEVHIAQAADCGSSDIV

1441 TGTCGCCGGCAGCGGATCCAAGGCCGTGCCCAATGTGCAGGGCAATTGCAACAGCAGTGA
VAGSGSKAVPNVQGNCNSSD

1501 TGCCGGTGTCACGGGCACAGGAGGCCCAACGGCCACGGGATCGACCCCGACTCCCACCAA
AGVTGTGGPTATGSTPTPTN

1561 CAATATCCCAGCTACAGCCGTTACAACTGTCTTCACGGTGACATCCTGCCCAGTTTTCGA
NIPATAVTTVFTVTSCPVFD

1621 TGTCGGATGCCGCACCGGCATGATCACAACCCAAACCATTCAAGGAGCTGAAGCTACGCC
VGCRTGMITTQTIQGAEATP

1681 TCAACCCAGCGCTACCAGCACTCCTAGTAGCGGCGGCAATGGAGATGGAGGAGACGAAGA
QPSATSTPSSGGNGDGGDED

1741 TGCAGGCGTGCGGCCTCCGGCCTTGACTTGGGTCTTTGTCGCGCTGGGTACTTTGGCAAT
AGVRPPALTWVFVALGTLAM

1801 GATTTTTAACATTGTA TAA TTTCAAGAAGGGGTCTCTTACTGTATATATTGACGAGCGAG
IFNIV*

1861 AGGACCATTGGCCATCAATTTTGGATAACTCGGCCGAGGATTTACATACCAGGTTCGTAT

1921 ATGACGATGGAGGGTGGGACATAGCTGCGCATGGAGGATTTATTTTCCCAATGGATACCT

1981 GACATACATAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

Fig. 5 Nucleotide sequence of cDNA and its deduced amino acid sequence. 
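
The reading frame shown in Fig. 5 can be spot-checked with a few lines of Python. This is a minimal sketch that assumes the standard genetic code; the names CODON_TABLE and translate are illustrative, and the two input strings are the first 30 and last 18 nucleotides of the open reading frame as read from Fig. 5 (positions 227-256 and 1802-1819).

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cdna: str) -> str:
    """Translate an in-frame cDNA string, stopping at the first stop codon."""
    seq = cdna.upper().replace(" ", "")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


ORF_START = "ATGAGGCTGACGACAGCGCTCGGGCTGCTG"   # nt 227-256 in Fig. 5
ORF_END = "ATTTTTAACATTGTATAA"                 # nt 1802-1819 in Fig. 5
ORF_LENGTH = 1593                              # bp, including the stop codon

if __name__ == "__main__":
    print(translate(ORF_START))   # first ten residues of the deduced protein
    print(translate(ORF_END))     # last five residues before the stop codon
    print(ORF_LENGTH // 3 - 1)    # codons minus the stop codon
```

The last line reproduces the reported protein length: 1593 bp correspond to 531 codons, one of which is the TAA stop codon, leaving 530 amino acids.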
Analysis of the amino acid sequence with SignalP v3.0 identified
a signal sequence, from which it was inferred that SA76 is a
secreted protein. The predicted cleavage site of the signal
sequence lay between amino acid positions 18 and 19 (A and V)
(Fig. 6).

BlastP analysis against the non-redundant protein sequence
database revealed a high degree of similarity between SA76 and
secreted aspartic proteinases belonging to the eukaryotic
aspartyl protease family (Fig. 7). Homology analysis of the
deduced amino acid sequence was performed with BLASTX and the
results were compared with GenBank data. The 530 deduced amino
acid residues shared 53% identity with the secreted aspartic
proteinase of G. zeae, and 37% and 36% identity with those of
N. crassa and C. globosum, respectively. Alignment of the SA76
sequence with the closest proteases allowed us to identify the
residues necessary for a functionally active gene product.


Fig. 6 Signal peptide prediction (SignalP-NN prediction, euk networks)

Fig. 7 Conserved domain prediction of SA76 gene

Fig. 8 Phylogenetic tree of SA76 gene from T. harzianum

As shown in Fig. 8, the phylogenetic tree included only sequences
with a very high degree of similarity to SA76 in the BlastP
analysis. The emphasis was on secreted aspartic proteinases from
filamentous fungi, and other typical sequences were also included
in constructing the tree. Fourteen sequences encoding secreted
aspartic proteinases were selected and aligned using the ClustalX
1.81 program, and a neighbour-joining tree was constructed and
displayed with TreeView. The phylogenetic analysis suggested that
the SA76 protein of T. harzianum clustered with those of G. zeae,
N. crassa, C. globosum and M. grisea. T. harzianum shared the
highest identity with G. zeae, followed by N. crassa, C. globosum
and M. grisea, and the lowest identity with C. immitis.

Discussion 

The BD SMART RACE technique is a recently developed PCR-based
method for obtaining full-length cDNA, with which the complete 5'
and 3' sequences of a target transcript can be isolated more
consistently (Chenchik et al. 1995; Chenchik et al. 1996).
Gene-specific primers (GSPs) play an important role in RACE.
Primers should be 23-28 nt long; primers longer than 30 nt
generally offer no advantage. GSPs should have a GC content of
50%-70% and a Tm of at least 65°C, preferably above 70°C. Longer
primers with annealing temperatures above 70°C give more robust
amplification in RACE, and a Tm above 70°C allows the use of
"touchdown PCR". Self-complementary primer sequences should be
avoided, because they readily fold back and form intramolecular
hydrogen bonds. Similarly, primers complementary to the Universal
Primer Mix, particularly at their 3' ends, should be avoided. In
3' RACE, the products of the first PCR showed no specific band,
only a smear, whereas two bands were found in the second PCR
products. Sequencing verified that one was the target band and
the other was non-specific. This result indicates that nested PCR
is a useful step in RACE.
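
The length and GC-content criteria above are easy to check programmatically. The sketch below applies them to the two gene-specific primers listed in the Methods; the function names gc_percent and check_primer are hypothetical. Tm is deliberately left out, because the text does not say which Tm formula was used.

```python
GSP3 = "CCAACCGTACAGCTTGCTGCTCAACAC"
NGSP3 = "AGCCCTCATCTATGGCGGTCTTGATCGCAG"


def gc_percent(primer: str) -> float:
    """Percentage of G and C bases in the primer."""
    seq = primer.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def check_primer(primer: str, min_len: int = 23, max_len: int = 28,
                 min_gc: float = 50.0, max_gc: float = 70.0) -> dict:
    """Compare a primer with the length and GC ranges quoted in the text."""
    gc = gc_percent(primer)
    return {
        "length": len(primer),
        "gc_percent": round(gc, 1),
        "length_ok": min_len <= len(primer) <= max_len,
        "gc_ok": min_gc <= gc <= max_gc,
    }


if __name__ == "__main__":
    for name, primer in (("GSP3", GSP3), ("NGSP3", NGSP3)):
        print(name, check_primer(primer))
```

With these inputs both primers fall within the 50%-70% GC range, while NGSP3, at 30 nt, is slightly longer than the recommended 23-28 nt.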

The SA76 protein of T. harzianum is close to those of G. zeae, N.
crassa, C. globosum and M. grisea. Most secreted aspartic
proteinases from fungi of the Moniliales and Deuteromycotina,
including those from T. harzianum and Magnaporthe, fell in the
same clade. In contrast, Gibberella (Sphaeriales), Neurospora
(Sphaeriales) and Chaetomium (Chaetomiales) belong to the
Ascomycota. The classification of these fungi is uncertain:
according to their typical characteristics, some are assigned to
the Deuteromycotina whereas others are classified in other
subphyla. Phylogenetic analysis of the SA76 gene with the closest
secreted aspartic proteinases therefore helps to identify its
evolutionary position, to confirm its relationship with other
fungi and, consequently, to ascertain the status of T. harzianum.
This lays a foundation for obtaining and validating functional
genes of T. harzianum.

The deduced amino acid sequence shared 53% identity with the
secreted aspartic proteinase of G. zeae, a biocontrol protease,
but the BLAST search returned no similar genes from T. harzianum.
In our experiment, the secreted aspartic proteinase gene from T.
harzianum was cloned for the first time. The cloning and analysis
of this gene may provide theoretical support for studying its
structure and expression, and for elucidating its functional
mechanism and its relationship to the biocontrol activity of T.
harzianum at the molecular level.